ASA: Accelerating Sparse Accumulation in Column-wise SpGEMM
Abstract
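Sparse linear algebra is an important kernel in many different applications. Among the various sparse general matrix-matrix multiplication (SpGEMM) algorithms, Gustavson's column-wise SpGEMM has good locality when reading the input matrix and can be easily parallelized by distributing the computation of the output matrix's columns across processors. However, the sparse accumulation (SPA) step in column-wise SpGEMM, which merges the partial sums from each of the multiplications by their row indices, is still a performance bottleneck. The state-of-the-art software implementation uses a hash table for the partial sum search in the SPA, which makes the SPA the largest contributor to the execution time of SpGEMM. There are three reasons that cause the SPA to become the bottleneck: (1) hash probing requires data-dependent branches that are difficult for a branch predictor to predict correctly; (2) the accumulation of a partial sum is dependent on the results of the hash probing, which makes it difficult to hide the probing latency; and (3) hash collisions require a time-consuming search, and optimizations to reduce these collisions require an accurate estimation of the number of non-zeros in each column of the output matrix. This work proposes the ASA architecture to accelerate the SPA. ASA overcomes the challenges of the SPA by (1) executing the partial sum search and accumulate with a single instruction through an ISA extension to eliminate the data-dependent branches in hash probing, (2) using a dedicated on-chip cache to perform the search and accumulation in a pipelined fashion, (3) relying on the parallel search capability of a set-associative cache to reduce search latency, and (4) delaying the merging of overflowed entries. As a result, ASA achieves an average of 2.25× and 5.05× speedup as compared to the state-of-the-art software implementation of a Markov clustering application and its SpGEMM kernel, respectively. As compared to a state-of-the-art hashing accelerator design, ASA achieves an average of 1.95× speedup in the SpGEMM kernel.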
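To make the bottleneck concrete, the software baseline that the abstract describes can be sketched roughly as follows. This is a minimal Python illustration, not the paper's implementation and not the ASA hardware: the function name `spgemm_column_wise`, the list-of-(index, value) column format, the bit-mask hash, and the table-sizing rule are all assumptions made for this example.

```python
def spgemm_column_wise(a_cols, b_cols):
    """Compute C = A @ B one output column at a time (Gustavson-style).

    a_cols[k] is a list of (row, value) pairs for column k of A.
    b_cols[j] is a list of (k, value) pairs for column j of B.
    Returns C in the same format, each column sorted by row index.
    All names and the storage format are hypothetical.
    """
    c_cols = []
    for b_col in b_cols:
        # Upper bound on non-zeros in this output column: one per multiply.
        upper = sum(len(a_cols[k]) for k, _ in b_col)
        # Assumed sizing rule: smallest power of two strictly above the
        # bound, so the table always keeps at least one empty slot.
        size = 1
        while size <= upper:
            size *= 2
        keys = [-1] * size   # row index stored in each slot, -1 = empty
        vals = [0.0] * size  # accumulated partial sum for that row

        for k, b_kj in b_col:
            for row, a_ik in a_cols[k]:
                slot = row & (size - 1)  # assumed trivial hash
                # Hash probing: a data-dependent loop that a branch
                # predictor struggles with; collisions fall back to a
                # linear search over following slots.
                while keys[slot] != -1 and keys[slot] != row:
                    slot = (slot + 1) & (size - 1)
                # The accumulate cannot start until probing has finished.
                keys[slot] = row
                vals[slot] += a_ik * b_kj

        c_cols.append(sorted((r, v) for r, v in zip(keys, vals) if r != -1))
    return c_cols
```

The inner `while` loop and the accumulate that waits on it correspond to reasons (1) to (3) in the abstract. According to the abstract, ASA replaces this search-and-accumulate pair with a single instruction backed by a dedicated set-associative on-chip cache.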
Journal
Journal title: ACM Transactions on Architecture and Code Optimization
Year: 2022
ISSN: 1544-3973, 1544-3566
DOI: https://doi.org/10.1145/3543068